Text File
|
1991-08-20
|
28KB
|
566 lines
FILE menu:
The FILE menu is divided into three sections: Single Word
Functions, Phrase Functions, and Miscellaneous.
The FILE menu has 12 available selections: Extract Single Words,
Extract Capitalized Words, Build Single Word Index, Word
Frequency, Spinoff Unique Words, Extract Phrases, Extract
Personal Names, Build Phrase Index, View Index on Screen, Print
Index to Printer, Save Defaults, and Go to DOS.
┌───────────────────────────────────────────────────────────────┐
│                 ┌──────────────────────────┐                  │
│                 │ ──Single Word Functions──│                  │
│                 │ Extract Single Words     │                  │
│                 │ Extract Capitalized Words│                  │
│                 │ Build Single Word Index  │                  │
│                 │ Word Frequency           │                  │
│                 │ Spinoff Unique Words     │                  │
│                 │ ──Phrase Word Functions──│                  │
│                 │ Extract Phrases          │                  │
│                 │ Extract Personal Names   │                  │
│                 │ Build Phrase Index       │                  │
│                 │ ─────Miscellaneous───────│                  │
│                 │ View Index on Screen     │                  │
│                 │ Print Index to Printer   │                  │
│                 │ Save Defaults            │                  │
│                 │ Go to DOS                │                  │
│                 └──────────────────────────┘                  │
│                                                               │
│ PC─INDEX 4.0─Index Generator  Copyright 1989─91 Help Software │
└───────────────────────────────────────────────────────────────┘
This menu is broken down into three categories: the first is
Single Word Functions, the second is Phrase Functions, and the
last is Miscellaneous Functions.
Extract Single Words
Extract Single Words is the first item in the menu. It is also
the first step performed in creating a single word index. Its
function is to extract each individual word from a document and
record it.
This option will extract all words in a document, one at a time,
and record them in sorted order along with the page number that
they occur on.
Before you begin with the Extract Words selection, you need to
select the proper document type from the DOCUMENT menu.
Select the Extract Single Words option from the FILE menu. You
should now see a new window asking you for an input filename, an
output filename, the page size, the page to start indexing on,
the first page number to use, and several other options.
For the input filename, enter the name of the document that you
want to index and press enter. For the output filename type any
name you want and press enter. The output file is not the index,
but a sorted list of all words in the document and the page
numbers that they occur on. It is recommended that you use the
same name as the document with '.srt' as the extension.
The entry for page size is only used if you are using a Text or
ASCII file. If you are using a word processor supported directly
by PC─INDEX then you can ignore this entry. For a list of word
processors supported by PC─INDEX, look in the Document menu.
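For a plain ASCII file, the page size is what turns line
positions into page numbers. The Python sketch below illustrates
the idea: it splits text into pages of a fixed number of lines
and records every word with its page number in sorted order. It
is not PC─INDEX code. The function name extract_words, the
regular expression that defines a word, and the list-of-pairs
result are assumptions made for the example; the real '.srt' file
layout is not described here.

```python
import re

def extract_words(text, page_size=60):
    """Return sorted (word, page) pairs; page_size is lines per page."""
    entries = []
    for line_no, line in enumerate(text.splitlines()):
        # Lines 0..page_size-1 are page 1, the next page_size lines page 2, ...
        page = line_no // page_size + 1
        # Assumed definition of a word: a run of letters and apostrophes.
        for word in re.findall(r"[A-Za-z']+", line):
            entries.append((word, page))
    # Sort alphabetically (ignoring case), then by page number.
    return sorted(entries, key=lambda e: (e[0].lower(), e[1]))

sample = "the cat\nsat down\nthe end"
for word, page in extract_words(sample, page_size=2):
    print(word, page)
```

With a page size of 2, the third line of the sample falls on page
2, so 'the' is recorded once for page 1 and once for page 2.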
The next entry is Start Indexing on Page. This entry allows you
to skip a few pages at the beginning of a document before the
indexing starts. This will let you skip a title page, table of
contents, or anything else at the beginning of a document that
you don't want to index.
The First Page Number to use setting will determine what page
number PC─INDEX will use as the first page number. This entry
can be used with the Start Indexing on Page setting so that you
can start indexing on page four, but the first page number will
be page one.
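The way these two settings combine can be sketched in a few lines
of Python. This is only an illustration of the arithmetic implied
above, not PC─INDEX code; the function and parameter names are
made up for the example, and returning None for a skipped page is
an assumption.

```python
def index_page_number(physical_page, start_indexing_on=1, first_page_number=1):
    """Return the page number used in the index, or None if the page is skipped."""
    if physical_page < start_indexing_on:
        return None  # pages before the start page are not indexed
    # The start page receives first_page_number; later pages count up from it.
    return physical_page - start_indexing_on + first_page_number

# Start indexing on page four, but call that page "page one".
for page in range(1, 7):
    print(page, index_page_number(page, start_indexing_on=4, first_page_number=1))
```

In this example the first three physical pages are skipped and
physical pages 4, 5, and 6 are numbered 1, 2, and 3.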
The rest of the selections fall into two groups: which word list
to use and what type of conversion to perform. One selection can
be made from the choices in each of the two groups.
The three choices on the left determine what words will be
included in the index. Here are the options and the effect that
they will have on an index.
Don't Use any Word List: When this option is selected every word
in the document will be included in the index. Common words like
'a', 'and', 'the', etc. will be indexed using this option.
Use Include Word List: When the Use Include Word list option is
selected, PC─INDEX will compare the extracted word to the include
word list. If a match is found, the extracted word will be
included in the extracted word list and the index.
Use Discard Word List: When the Use Discard Word List option is
selected, PC─INDEX will compare the extracted word to the discard
word list. If a match is found, the extracted word will be
discarded and will not be included in the extracted word list or
the index.
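The three word list choices amount to a simple filter. The
following Python sketch shows one way to express it. The function
name keep_word and the mode strings are hypothetical, and the
case-insensitive comparison is an assumption; the manual does not
say how PC─INDEX compares words against its lists.

```python
def keep_word(word, mode="none", word_list=frozenset()):
    """Decide whether an extracted word is kept.

    mode is 'none', 'include', or 'discard' (names chosen for this sketch).
    """
    w = word.lower()  # assumption: matching ignores case
    if mode == "none":
        return True               # every word is kept
    if mode == "include":
        return w in word_list     # kept only if it appears in the list
    if mode == "discard":
        return w not in word_list # dropped if it appears in the list
    raise ValueError("unknown mode: " + mode)

discard = frozenset({"a", "and", "the"})
words = ["The", "index", "and", "generator"]
print([w for w in words if keep_word(w, "discard", discard)])
```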
For consistency, PC─INDEX can convert all words to the same case
as they are extracted. If you want to do any
conversion, you have three choices. Convert words to UPPER CASE
will convert all words to upper case, Convert words to lower case
will convert all words to lower case, and Convert words to UPPER
& lower case will convert the first letter in the word to upper
case and the rest of the word to lower case. If you select No
Conversion then no conversion will take place.
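As a small illustration, the four conversion choices behave like
the Python function below. The function name and the mode strings
are invented for this sketch and are not PC─INDEX settings.

```python
def convert_case(word, mode="none"):
    """Apply one of the four conversion choices to a word."""
    if mode == "upper":
        return word.upper()
    if mode == "lower":
        return word.lower()
    if mode == "upper_lower":
        # First letter upper case, the rest of the word lower case.
        return word[:1].upper() + word[1:].lower()
    if mode == "none":
        return word
    raise ValueError("unknown mode: " + mode)

for mode in ("upper", "lower", "upper_lower", "none"):
    print(mode, convert_case("inDEX", mode))
```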
The completed window should look like this:
┌───────────────────────────────────────────────────────────────┐
│ Input File Name: (Name of Document to process)                │
│ pci.doc                                                       │
│                                                               │
│ Output File Name:                                             │
│ pci.srt                                                       │
│                                                               │
│ Page Size   Start Indexing on Page   First Page Number to use │
│ 60          5                        1                        │
│                                                               │
│   Don't Use Any Word List      X Perform No Conversion on Word│
│                                                               │
│   Use Include Word List          Convert Word to UPPER Case   │
│                                                               │
│ X Use Discard Word List          Convert Word to lower Case   │
│                                                               │
│                                  Convert Word to UPPER/lower  │
│                                                               │
└───────────────────────────────────────────────────────────────┘
When you have finished entering the filenames and other
information, press F10 to begin processing.
Extract Capitalized Words
The Extract Capitalized Words selection works in exactly the same
manner as Extract Single Words, except that it only extracts
capitalized words, such as names.
Build Single Word Index
Build Single Word Index is the final step in creating a single
word index. It takes the file created by the 'Extract Single
Words' selection and edited by the 'Edit Extracted Word File'
selection and creates an index.
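Conceptually, this step collapses the sorted word and page list
into one entry per word. The Python sketch below shows that
grouping under stated assumptions: the input is a list of
(word, page) pairs, and each index entry is printed as the word
followed by its page numbers. The real '.srt' and '.NDX' layouts
are not described in this section, so both formats are invented
for the example.

```python
from itertools import groupby

def build_index(entries):
    """Turn (word, page) pairs into one 'word  pages' line per word."""
    lines = []
    # groupby needs its input sorted by the grouping key.
    for word, group in groupby(sorted(entries), key=lambda e: e[0]):
        # Drop repeated page numbers so each page is listed once.
        pages = sorted({page for _, page in group})
        lines.append(word + "  " + ", ".join(str(p) for p in pages))
    return lines

entries = [("cat", 1), ("the", 1), ("the", 2), ("cat", 1)]
for line in build_index(entries):
    print(line)
```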
Select 'Build Single Word Index' from the FILE menu. You will be
asked for the input file and output file. Enter the name of the
extracted word file that you created with the Extract Words
process. This file should have '.SRT' as the filename extension.
Next you will be asked what name you want to use for the output
file. This is the filename of the index. It is recommended
that you use the original document name with the extension
'.NDX'.
The Wildcard Description file is only used if you are processing
a group of files together. If you indexed a group of files then
use the same wildcard description filename here. It contains
information that PC─INDEX needs to complete the index.
Next, PC─INDEX wants to know the page length (how many lines per
page) you want to use. The default setting is 66 which is the
proper setting for letter size paper. If you are using legal
size paper, the proper setting would be 88. This number does not
need to match the lines per page setting you used when you
selected 'Extract Words'. Most laser printers will only output
60 lines per page. If you will be printing the index on a laser
printer, you will probably want to set this option to 60.
The next item to fill in is the page width. Here you will enter
the total number of characters that will fit on one line of your
printer. The maximum width accepted by PC─INDEX is 132
characters. The number next to page width in reverse video is
the calculated width required for the settings you have selected.
This number (required width) must be smaller than the Page Width
setting or an error will occur.
Next, PC─INDEX asks you the number of columns you would like the
output to be in. You will be able to produce an index up to four
columns wide. An example of a two column index is included at
the end of this document.
The column width is the next entry. This entry controls the
width of each column in the index. The minimum allowable width
is 30 characters and the maximum is 99.
The number of spaces between columns can range from 1 to 9
characters.
Next fill in the top, bottom, left, and right margins to the
settings that you wish.
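The required width shown in reverse video can be estimated by
hand. The Python sketch below assumes that it is the sum of the
column widths, the gaps between columns, and the left and right
margins. That formula is an inference, not something the manual
states, but it does reproduce the 78 shown in the sample window
below (2 columns of 30, one gap of 3, margins of 10 and 5). The
function name is hypothetical.

```python
def required_width(columns, column_width, space_between,
                   left_margin, right_margin):
    """Estimate the line width needed for the chosen layout (assumed formula)."""
    gaps = columns - 1  # one gap between each pair of adjacent columns
    return (columns * column_width
            + gaps * space_between
            + left_margin + right_margin)

page_width = 80
needed = required_width(columns=2, column_width=30, space_between=3,
                        left_margin=10, right_margin=5)
print(needed, needed < page_width)
```

If the estimate is not smaller than the Page Width setting,
reduce the number of columns, the column width, or the margins
before pressing F10.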
The completed input window should look like this:
┌───────────────────────────────────────────────────────────────┐
│ Input File Name:                                              │
│ pci.srt                                                       │
│                                                               │
│ Output File Name:                                             │
│ pci.ndx                                                       │
│                                                               │
│ Wildcard Description File Name: (Leave Blank if not needed)   │
│                                                               │
│                                                               │
│ Page Size        Page Width (Columns)     Number of Columns   │
│ 66               80   78                  2                   │
│ Column Width     Space Between Columns    Top Margin          │
│ 30               3                        5                   │
│ Bottom Margin    Left Margin              Right Margin        │
│ 5                10                       5                   │
└───────────────────────────────────────────────────────────────┘
When you have finished entering the filenames and other
information, press F10 to begin processing.
You should see a status box which tells you the number of words
to be processed, the number of words actually processed, the
letter of the alphabet currently being processed, percentage
completed, and the elapsed time.
When this is finished, you will be returned to the main menu and
the completed index is contained in the text file under the name
you entered. If you wish to view the file you can select View
Index from the File Menu. If you want to print the index to a
printer select Print Index from the File Menu. Since the index
file is an ASCII file, you could also load it into almost any
word processor and edit it further if you wish.
Word Frequency List
The Word Frequency List selection builds a word frequency list.
This list contains all unique words found in a document in
alphabetical order and the number of times that each word was
used. This list is built from an extracted single word file. If
you want a complete listing of all words, be sure to extract
words using the 'Don't use any Word List' option.
Enter the name of the extracted word file that you want to
process for the Input File Name. If you have not already created
an extracted single word file, then you will need to do this
first.
Enter any name you want for the output file name. This file will
be an ASCII text file when finished. For consistency, it is
recommended that you use the document name with the extension
'.frq'.
The minimum word count that you are asked for will allow you to
set a minimum number of occurrences for a word to be included in
the word frequency file. In other words, if you want only the
most frequently used words in the word frequency list, you might
enter 20 or some other large number in the Minimum Word Count
entry. This way only words occurring 20 or more times would be
included in the word frequency list.
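The effect of the Minimum Word Count entry can be illustrated
with a short Python sketch. It assumes the extracted single word
file has already been read into a list of (word, page) pairs; that
representation and the function name are assumptions made for the
example, not the PC─INDEX file format.

```python
from collections import Counter

def word_frequency(entries, minimum_count=1):
    """Return alphabetical (word, count) pairs meeting the minimum count."""
    counts = Counter(word for word, _page in entries)
    return sorted((w, n) for w, n in counts.items() if n >= minimum_count)

entries = [("cat", 1), ("the", 1), ("the", 2), ("the", 3)]
print(word_frequency(entries))                   # every unique word
print(word_frequency(entries, minimum_count=2))  # frequent words only
```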
Spinoff Unique Words
The Spinoff Unique Words selection creates a file of phrases from
an extracted single word file. This can be helpful when creating
a customized list of phrases.
This option will go through an extracted word file and write out all
unique words to a phrase file. By editing the '.srt' file with
the Edit Extracted word file (found under the Edit Menu) you can
mark or un─mark individual words. Then when you spin off a list
you can spin off either the marked words or the un─marked words.
First select Spinoff Unique Words from the File menu. Enter the
Input File Name. It must be an extracted single word file. Next
enter the Output File Name. This will be a phrase file and you
should name it with a '.dbf' extension. Finally enter 'a' or 'i'
to spin off either active or inactive words. Press F10 and
processing will begin.
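The 'a' or 'i' choice works like a filter on a flag stored with
each word. The Python sketch below models that with a list of
(word, page, active) triples. The triple layout, the function
name, and the idea that each word carries a true/false active
flag are assumptions made for illustration; the actual file
formats are not described here.

```python
def spin_off(entries, which="a"):
    """Return the unique words that are active ('a') or inactive ('i')."""
    want_active = {"a": True, "i": False}[which]
    # A set removes duplicates; sorting gives a stable, alphabetical list.
    return sorted({word for word, _page, active in entries
                   if active == want_active})

entries = [("index", 1, True), ("index", 4, True),
           ("the", 1, False), ("menu", 2, True)]
print(spin_off(entries, "a"))
print(spin_off(entries, "i"))
```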
You can change the default file names that PC─INDEX uses for
phrase lists by using the Edit Word List Filenames selection
under the Edit menu.
Extract Phrases
Extract Phrases will search through a document and find all
occurrences of a list of phrases. It is the first step performed
in creating a phrase index. Its function is to extract each
individual phrase from a document and record it.
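In outline, phrase extraction checks each page of the document
against every phrase in the list. The Python sketch below shows
the idea under simplifying assumptions: the document is already
split into a list of page texts, matching ignores case, and a
plain substring test stands in for whatever matching PC─INDEX
actually performs (so a phrase could match inside a longer word).
The function name is hypothetical.

```python
def extract_phrases(pages, phrases):
    """Return sorted (phrase, page) pairs for each phrase found on a page."""
    found = []
    for page_no, text in enumerate(pages, start=1):
        lowered = text.lower()
        for phrase in phrases:
            if phrase.lower() in lowered:  # simplified substring match
                found.append((phrase, page_no))
    return sorted(found)

pages = ["The index generator runs.", "Word list and index generator."]
for phrase, page in extract_phrases(pages, ["index generator", "word list"]):
    print(phrase, page)
```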
Before you begin with the Extract Phrases selection, you need to
select the proper document type from the Document menu.
Select the Extract Phrases option from the FILE menu. You should
now see a new window asking you for an input filename, an output
filename, the page size, the first page number to start indexing
on, and the first page number to use.
For the input filename, enter the name of the document that you
want to index and press enter. You can press F2 here to select a
file from a list. For the output filename type any name you
want and press enter.
The output file is not the index, but a sorted list of phrases in
the document and the page numbers where they were found. It is
recommended that you use the same name as the document with
'.srt' as the extension.
The entry for page size is only used if you are using a text or
ASCII file. If you use a word processor supported directly by
PC─INDEX then you can ignore this entry. For a list of word
processors supported by PC─INDEX, look in the Document menu.
The next entry is Start Indexing on Page. This entry allows you
to skip a few pages at the beginning of a document before the
indexing starts. This will let you skip a title page, table of
contents, or anything else that you don't want to index.
The First Page Number to use setting will determine what page
number PC─INDEX will use as the first page number. This entry
can be used with the Start Indexing on Page setting so that you
can start indexing on page four, but the first page number will
be page one. This will be useful if you want to skip a few pages
at the beginning of a document.
The completed window should look something like this:
┌───────────────────────────────────────────────────────────────┐
│ Input File Name: (Name of Document to process)                │
│ pci.doc                                                       │
│                                                               │
│ Output File Name:                                             │
│ pci.srt                                                       │
│                                                               │
│ Page Size   Start Indexing on Page   First Page Number to use │
│ 66          4                        1                        │
└───────────────────────────────────────────────────────────────┘
When you have finished entering the filenames and other
information, press F10 to begin processing.
Extract Personal Names
This menu selection is new to this version of PC─INDEX. Extract
Personal Names will go through a document, find personal names
(first and last names), and write them out to a phrase file. This
file can then be used to create a name index or merged with
another phrase file to create a more comprehensive index that
includes names.
This selection is not guaranteed to find all names in a document,
but it is a good starting point. It is more likely to extract
capitalized words that are not really names than to omit real
names.
In order to use this option correctly, it will be helpful to
understand what is happening. PC─INDEX scans a document until it
finds at least two capitalized words in a row. If two
capitalized words are found, then the first word is looked up in
the Personal Name File. If the name is found then this sequence
of capitalized words is assumed to be a personal name.
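The scan described above can be sketched in Python as follows.
This is a simplified illustration, not the PC─INDEX algorithm
itself: the tokenizing regular expression ignores punctuation
between words, the first_names set stands in for the Personal
Name File, and cutting a long run down to max_words (the Maximum
Number of Words in a Name setting) is an assumption about how
that limit is applied. The surname options are not modeled.

```python
import re

def extract_names(text, first_names, max_words=3):
    """Return candidate personal names found in text."""
    known = {name.lower() for name in first_names}
    words = re.findall(r"[A-Za-z']+", text)
    names, i = [], 0
    while i < len(words):
        # Collect the run of consecutive capitalized words starting at i.
        j = i
        while j < len(words) and words[j][0].isupper():
            j += 1
        run = words[i:j]
        # At least two capitalized words, and the first is a known first name.
        if len(run) >= 2 and run[0].lower() in known:
            names.append(" ".join(run[:max_words]))
            i = j       # continue after the whole run
        else:
            i += 1      # retry from the next word
    return names

text = "Yesterday Mary Ann Jones met the Board. Then john left."
print(extract_names(text, {"Mary", "John"}))
```

Retrying from the next word lets the sketch skip a capitalized
word at the start of a sentence ('Yesterday') and still pick up
the name that follows it.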
The Personal Name File contains over 12,000 first names. You may
want to browse through the list using the Edit Personal Name File
(found in the Edit List Menu) to make sure that it contains names
you know you need.
When you select Extract Personal Names, you will see a screen
asking you for an Input File Name, an Output File Name, the
Maximum Number of Words in a Name, and information regarding the
surname (last name).
For the input file name enter the name of the document you want
to extract names from. For the output file name enter any name
you want. It is recommended that you use a file name with the
extension '.dbf'.
The maximum number of words in a name can be any number from 2 to
6. There must be at least 2 words in a name (a first and last
name) and no more than 6.
The last three choices tell PC─INDEX how last names can be
recognized. These choices were added to help PC─INDEX to find
names faster and more accurately.
The fastest and most accurate method for extracting names is the
Last Name is ALL CAPS option. In order to use this option, all
surnames must contain all capital letters and names that are not
surnames cannot contain all caps. If it isn't possible to use
all caps in last names then use one of the other options. If it
doesn't matter to you whether last names are all caps or not,
then it is recommended that you use all caps. The increase in
speed and accuracy will be significant.
The next option, Last Name is not ALL CAPS, tells PC─INDEX that no
names will contain only capital letters. This is the second
fastest and second most accurate method for extracting names.
The last option, Last Name may or may not be ALL CAPS, should be
selected if the way capital letters are used in names is not
consistent.
The completed screen should look something like this:
┌──────────────────────────────────────────────────────┐
│ Input File Name: (Name of Document to process) │
│ pci.doc │
│ │
│ Output File Name: │
│ pci.dbf │
│ │
│ Maximum Number of Words in a Name (2 ─ 6) │
│ 3 │
│ │
│ X Last Name is ALL CAPS │
│ │
│ Last Name is not ALL CAPS │
│ │
│ Last Name may or may not be ALL CAPS │
└──────────────────────────────────────────────────────┘
When you have finished entering the filenames and other
information, press F10 to begin processing.
You should see a status box which tells you the number of words
to be processed, the number of words actually processed, the
number of names found, percentage completed, and the elapsed
time.
After this is complete you can (and probably should) browse
through and edit the names that were just extracted by selecting
Edit Extracted Name File from the Edit List Menu. This will
allow you to correct names if necessary or to delete entries
completely.
You may want to merge the extracted name file with a phrase file
so an index will contain both names and phrases. Since the
extracted name file is actually a phrase file, you can use Merge
Phrase Files (found in the Merge Files Menu) to accomplish this.
Build Phrase Index
Build Phrase Index is the final step in creating a phrase index.
Build Phrase Index takes the file created by the 'Extract
Phrases' selection and creates a phrase index.
Select 'Build Phrase Index' from the FILE menu. You will be
asked for the input file and output file. Enter the name of the
extracted phrase file that you created with the Extract Phrases
process. This file should have '.SRT' as the filename extension.
Next you will be asked what name you want to use for the output
file. This is the filename for the final index. It is
recommended that you use the original document name with the
extension '.NDX'.
The Wildcard Description file is only used if you are processing
a group of files together. If you indexed a group of files then
use the same wildcard description filename here. It contains
information that PC─INDEX needs to complete the index.
Next, PC─INDEX wants to know the page length (how many lines per
page) you want to use. The default setting is 66 which is the
proper setting for letter size paper. If you are using legal
size paper, the proper setting would be 88. This number does not
need to match the lines per page setting you used when you
selected 'Extract Phrases'. Many laser printers normally print 60
lines per page. If you will be printing the index on a laser
printer, you will probably want to set this option to 60.
The next item to fill in is the page width. Here you will enter
the total number of characters that will fit on one line of your
printer. The maximum width accepted by PC─INDEX is 132
characters. The number next to page width in reverse video is
the calculated width required for the settings you have selected.
This number (required width) must be smaller than the Page Width
setting or an error will occur.
Next, PC─INDEX asks you the number of columns you would like the
output to be in. You will be able to produce an index up to four
columns wide if your columns are small enough. An example of a
two column phrase index is included at the end of this document.
The column width is the next entry. This entry controls the
width of each column in the index. The minimum allowable width
is equal to the longest phrase in the phrase list that you used,
and the maximum is 99.
The number of spaces between columns can range from 1 to 9.
Next fill in the top, bottom, left, and right margins to the
settings that you wish.
The completed input window should look something like this:
┌───────────────────────────────────────────────────────────────┐
│ Input File Name:                                              │
│ pci.srt                                                       │
│                                                               │
│ Output File Name:                                             │
│ pci.ndx                                                       │
│                                                               │
│ Wildcard Description File Name: (Leave Blank if not needed)   │
│                                                               │
│                                                               │
│ Page Size        Page Width (Columns)     Number of Columns   │
│ 66               80   78                  2                   │
│ Column Width     Space Between Columns    Top Margin          │
│ 30               3                        5                   │
│ Bottom Margin    Left Margin              Right Margin        │
│ 5                10                       5                   │
└───────────────────────────────────────────────────────────────┘
When you have finished entering the filenames and other
information, press F10 to begin processing.
You should see a status box which tells you the number of words
to be processed, the number of words actually processed, the
letter of the alphabet currently being processed, percentage
completed, and the elapsed time.
When this is finished, you will be returned to the main menu and
the completed index is contained in the text file that you named.
If you wish to view the file you can select View Index from the
File Menu and enter the name of the index that you just created.
If you want to print the index, select Print Index from the File
Menu. Since the index is an ASCII file, you could also load it
into most word processors and edit it further if you wish.
View Index on Screen
View Index on Screen lets you see how the index you created
looks. You will probably want to browse the index before you
print it. You can use this selection to view any ASCII file.
Print Index to Printer
Print Index to Printer lets you print an index on your printer.
If you have a problem using this make sure that you have selected
the correct printer port.
You can change this using the Edit Default Settings List in the
Edit List Menu.
Save Defaults
Save Defaults saves the current settings in the DOCUMENT menu.
It will also save all numeric settings and default word list
filenames in the various dialogue boxes.
Go to DOS
Go to DOS allows you to perform DOS commands. Type EXIT to
return to PC─INDEX when you are finished.